Integrated circuit and recording medium on which data on integrated circuit is recorded

ABSTRACT

In an integrated circuit, an FPGA (2) has functions of a CPU core (5), and includes a user's circuit and so forth. This configuration allows the number of implemented components such as peripheral circuit chips to be decreased, and cost to be reduced. The integrated circuit is configured such that the CPU core (5), peripheral circuits thereof, and a system bus (8) are stored as logic data in a PROM (3), and the FPGA (2) performs functions as the CPU core (5), peripheral circuits (6) (7), and system bus (8) based on the logic data. Therefore, the CPU core (5), peripheral circuits (6) (7), and system bus (8) which have desired functions can be obtained according to contents of the logic data stored in the PROM (3). Further, a user can readily extend and change functions of the CPU core (5) by retrofitting a separate circuit to the system bus (8).

FIELD OF THE INVENTION

This invention relates to an integrated circuit, and particularly relates to an art such that a field programmable gate array (FPGA) performs functions as a central processing unit (CPU) core and a peripheral device thereof.

BACKGROUND OF THE INVENTION

Conventionally, as logic integrated circuits (ICs), there have been known a general-purpose logic IC and an application specific IC (ASIC). The general-purpose logic IC, which can be mass-produced and is cost-effective, includes devices whose functions are completed by the users themselves, such as a microprocessor and a programmable logic device (PLD). Known PLDs include a programmable logic array (PLA), an FPGA, and the like. In the FPGA, the user places a logic module configured by a basic logic circuit and unconnected wiring on a chip, and completes the wiring with program elements, thereby obtaining a desired function. The microprocessor is generally called a system large-scale integration (LSI) in that a CPU is integrated on an LSI chip, and is embodied as a combination of a logic circuit and a memory circuit. Further, as this kind of microprocessor, there has been known a RISC (reduced instruction set computer) in which high performance is achieved by simplifying command processing and hardware.

However, predetermined functions are integrated in advance in the above-mentioned microprocessor, which includes a general-purpose CPU core, and users use only limited functions among them, so that it is difficult to change the configuration in order to serve specific purposes. Furthermore, the microprocessor is configured by a CPU core and a number of chips, which complicates the configuration with many implemented components, so that problems have arisen in reliability.

SUMMARY

This invention is made to solve the above-mentioned problems. One object of the present invention is to provide an integrated circuit as a RISC processor, wherein an FPGA itself has functions as a CPU core, and includes a user's circuit and the like, thereby performing as a system LSI having functions desired by users without employing a conventional CPU core, and wherein implemented components such as chips of peripheral circuits are decreased in number, thereby allowing costs to be reduced.

In order to achieve the above-mentioned objects, the present invention provides an integrated circuit equipped with a field programmable gate array and a memory device, wherein a CPU core and peripheral circuits connected thereto are stored as logic data in the memory device, and wherein the field programmable gate array performs functions as the CPU core and the peripheral circuits based on contents stored in the memory device.

In the above, functions of the CPU core and peripheral circuits can be changed according to the logic data stored in the memory device, which allows a system LSI to be designed easily. Furthermore, since the field programmable gate array performs functions as the CPU core and peripheral circuits, the number of chips to be implemented is decreased.

The above-mentioned constitution can be constructed such that the peripheral circuits include a system bus to which a user can connect an arbitrary circuit. Therefore, the user can readily extend and change functions of the CPU core by retrofitting a desired circuit.

Further, the above-mentioned configuration can be constituted such that the arithmetic processing performed by the CPU core has a structure in which a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein these steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed in parallel in the three-stage pipeline construction. In this configuration, in the parallel operation of multiple arithmetic processing, the fetch cycle and memory cycle are not carried out simultaneously, so that no situation arises in which these cycles compete with each other for the same memory. Therefore, this configuration can perform the parallel processing without a cache memory.

Furthermore, according to the present invention, a computer readable recording medium has data to be written in a recording device of an integrated circuit, composed of a field programmable gate array and a memory device. The data is logic data for making the field programmable gate array perform functions as the CPU core and peripheral circuits connected to the CPU core.

The above-mentioned constitution makes it possible to easily design a system LSI such that a computer reads out the data in the recording medium, and the field programmable gate array performs functions as the CPU core and peripheral circuits thereof in the computer.

According to the above-mentioned configuration, the peripheral circuits described by the logic data in the computer-readable recording medium can include a system bus, wherein a user can connect an arbitrary circuit to the system bus.

Furthermore, in the above, the logic data includes a configuration in which arithmetic processing executed by the CPU core has a structure in which a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein these steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed in parallel in the three-stage pipeline construction.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing a configuration of an integrated circuit according to one embodiment of the present invention.

FIG. 2 is a view showing a datapath when a user's circuit is connected to a CPU core of the integrated circuit.

FIG. 3 is a view showing a parallel operation of a three-stage pipe in a pipeline control.

FIG. 4 is a view showing three stages in arithmetic processing of said CPU core.

FIG. 5(a) is a view showing a flow stalled for obtaining a branch address, and FIG. 5(b) is a view showing a flow of an NOP insertion of branch delay.

FIG. 6(a) is a view showing a flow stalled for data dependence, and FIG. 6(b) is a view showing a flow in a case that a bypass circuit is provided between an ALU and a register file.

FIG. 7 is a view showing a flow of a pipeline control in the arithmetic processing of the CPU core.

FIG. 8 is a view showing a datapath of the CPU core.

FIG. 9 is a view showing a parallel operation of a five-stage pipe in a conventional pipeline control.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT OF THE PRESENT INVENTION

An embodiment of the present invention is explained below with reference to the drawings. As shown in FIG. 1, an integrated circuit 1 according to the present invention, which is a RISC processor having an FPGA 2 and a PROM 3, constructs a system LSI. The FPGA 2 is controlled based on logic data which is stored in the PROM 3. Thus, the FPGA 2 performs various kinds of logic functions according to the logic data which is written in the PROM 3. In the present embodiment, the FPGA 2 is configured so as to perform functions as a CPU core 5, an interrupt module 6, a timer module 7 and a system bus (SBUS or internal bus) 8 which is connected to the CPU core 5. The CPU core 5 is provided with an index register (IREG) 5a for indicating interrupt priority, and a prescaler (TREG) 5b. The interrupt module 6 is provided with a mask register 6a. The CPU core 5 is connected to the interrupt module 6 and timer module 7 through the system bus 8. Further, the CPU core 5 is connected to each of an address bus 11, a data bus 12, and a control bus 13, and exchanges data with the PROM 3.

Referring to FIG. 2, explanation is given of a connection made by a user between an arbitrary circuit and the system bus 8. The user can arbitrarily connect a desired circuit 15 to the system bus 8. FIG. 2 shows a datapath in such a case. The desired circuit 15 is provided by storing logic data in the PROM 3. In the present embodiment, the user's circuit 15 latches a processed result into an XREG 15a and a YREG 15b, and has the CPU core 5 process the result (reading operation). This constitution, in which the system bus 8 is provided for making a connection of the user's circuit 15, allows the user to readily extend the functions of the CPU core 5. Therefore, the CPU core 5 speedily performs processing which conventionally could not be performed without using a number of commands in a multi-programming control executed by software programs. The CPU core 5 is provided with an ALU 51 for performing the four fundamental arithmetic operations and logic operations, a register file 52 for temporarily storing commands, data and the like, and a bypass circuit 53 between the ALU 51 and register file 52 (later described in detail).
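The reading operation described above can be illustrated with a minimal software model. This is a sketch only, not the circuit of the embodiment: the bus addresses, the 16-bit register width, the class and function names, and the choice of a multiplier as the user's circuit are all hypothetical and are introduced purely for illustration. Only the idea that the user's circuit latches its result into an XREG and a YREG, and that the CPU core then reads those registers over the system bus, comes from the description.

```python
# Hypothetical register addresses on the system bus (not from the source).
XREG_ADDR = 0x10
YREG_ADDR = 0x11


class UserCircuit:
    """Hypothetical user's circuit: a 16x16 multiplier.

    The 32-bit product is latched into XREG (upper half) and
    YREG (lower half), mirroring the two result registers of circuit 15.
    """

    def __init__(self):
        self.xreg = 0
        self.yreg = 0

    def compute(self, a, b):
        product = (a & 0xFFFF) * (b & 0xFFFF)
        self.xreg = (product >> 16) & 0xFFFF  # latch upper 16 bits
        self.yreg = product & 0xFFFF          # latch lower 16 bits


class SystemBus:
    """Toy model of the system bus: maps addresses to readable registers."""

    def __init__(self):
        self._readers = {}

    def attach(self, address, reader):
        # 'reader' is a zero-argument callable returning the register value.
        self._readers[address] = reader

    def read(self, address):
        return self._readers[address]()


def build_system():
    """Connect the user's circuit to the bus and return both objects."""
    bus = SystemBus()
    circuit = UserCircuit()
    bus.attach(XREG_ADDR, lambda: circuit.xreg)
    bus.attach(YREG_ADDR, lambda: circuit.yreg)
    return bus, circuit
```

In this model the CPU core side needs only two bus reads to obtain a result that would otherwise take a multi-instruction software routine, which is the kind of speed-up the description attributes to retrofitting a circuit to the system bus.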

The logic data to be stored in the above-mentioned PROM 3 can be easily generated by using a tool such as CAD (computer-aided design). For example, the user creates a circuit diagram having desired functions by using CAD for constructing a desired CPU, and then converts the generated circuit diagram into HDL (hardware description language) with data conversion software, thereby obtaining logic data which allows the FPGA to perform functions as the CPU. The logic data is provided to the user in a storage medium (a recording medium in the claims) which can be read out by a computer, such as a floppy disk, a CD-ROM, or a DVD. The data stored in the storage medium is read into CAD, where the logic data can be arbitrarily changed, or a user's desired circuit can be added as a peripheral circuit of the CPU. Thus, the storage medium assists the user in simplifying the design of a system LSI.

Next, arithmetic processing performed by the CPU core 5 will be explained with reference to FIGS. 3 and 4. The arithmetic processing by the CPU core 5 is carried out by a pipeline control in a three-stage pipe construction. In the present embodiment, each arithmetic processing is composed of six steps: fetch (F), decode (D), execution (E), memory (M), write back (W) and dummy (X). As shown in FIG. 4, the arithmetic processing is divided into three stages which are carried out in order from the first to the third stage: a first stage for carrying out in order of F-D, a second stage for carrying out in order of E-M, and a third stage for carrying out in order of X-W. Then, as shown in FIG. 3, arithmetic processing is newly started every time one stage is ended. In this procedure, {F, E, X} and {D, M, W} are always operated in parallel in the three-stage pipe. Accordingly, a fetch cycle and a memory cycle are not carried out simultaneously, so that no situation arises in which these cycles compete with each other for the same memory.
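The timing argument above can be checked with a small scheduling model. The following is a sketch under stated assumptions: each of the six steps is assumed to occupy one clock cycle, so that each stage spans two cycles and a new arithmetic processing starts every two cycles. The function names are hypothetical and serve only this illustration.

```python
# The six steps in execution order; X is the dummy step.
STEPS = ["F", "D", "E", "M", "X", "W"]
STAGE_LEN = 2  # steps per stage (assumption: one step per clock cycle)


def active_steps(cycle, n_instructions):
    """Return the set of steps being executed in the given cycle."""
    active = set()
    for i in range(n_instructions):
        # Instruction i starts one stage (two cycles) after instruction i-1.
        offset = cycle - i * STAGE_LEN
        if 0 <= offset < len(STEPS):
            active.add(STEPS[offset])
    return active


def schedule(n_instructions):
    """Return the active-step set for every cycle of the whole run."""
    total_cycles = (n_instructions - 1) * STAGE_LEN + len(STEPS)
    return [active_steps(c, n_instructions) for c in range(total_cycles)]


def has_memory_conflict(sched):
    """True if any cycle contains both a fetch (F) and a memory (M) step."""
    return any({"F", "M"} <= steps for steps in sched)
```

With the pipe full, the model alternates between the sets {F, E, X} and {D, M, W}, so `has_memory_conflict` is false for any number of instructions. The dummy step is what keeps the step count even; without it, successive instructions would drift out of this two-phase alignment.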

Here, a conventional pipeline control of five-stage pipe construction will be explained with reference to FIG. 9. The conventional five-stage pipe construction is composed of five steps: fetch (F), decode (D), execution (E), memory (M) and write back (W). In this construction, the steps are carried out in order of F-D-E-M-W, and arithmetic processing is newly started every time one step is ended. After that, the steps of multiple arithmetic processing are carried out in parallel. In this control, as shown in FIG. 9, the fetch cycle and memory cycle are carried out simultaneously, so that they compete with each other for the same memory. Therefore, it is necessary to provide a cache memory composed of a program cache memory and a data cache memory. However, in the present invention as mentioned above, the fetch cycle and memory cycle do not compete with each other for the same memory, so that it is unnecessary to provide a cache memory, and possible to effectively use the FPGA.
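For comparison, the same kind of model can be applied to the conventional five-stage pipe. This sketch assumes one step per clock cycle, a new arithmetic processing started every cycle, and, as a worst case, that every memory (M) step actually accesses the memory. The function name is hypothetical.

```python
# The five steps of the conventional pipeline, in execution order.
STEPS = ["F", "D", "E", "M", "W"]


def conflict_cycles(n_instructions):
    """Return the cycles in which a fetch (F) and a memory (M) step coincide.

    Instruction i is assumed to start at cycle i, so its step in a given
    cycle is STEPS[cycle - i] while that index is in range.
    """
    total_cycles = n_instructions - 1 + len(STEPS)
    conflicts = []
    for cycle in range(total_cycles):
        active = {
            STEPS[cycle - i]
            for i in range(n_instructions)
            if 0 <= cycle - i < len(STEPS)
        }
        if {"F", "M"} <= active:
            conflicts.append(cycle)
    return conflicts
```

In this model the first overlap appears as soon as a fourth instruction is issued, because its fetch falls in the same cycle as the memory step of the first instruction. That overlap is the contention which the conventional construction resolves with separate program and data cache memories.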

Now, referring to FIGS. 5 and 6, a solution of a structural hazard and a data hazard in the above-mentioned pipeline control will be explained. In the above-mentioned pipeline control, calculation of the branch address is carried out in execution (E), as shown in FIG. 5(a). Therefore, it is necessary to stall instructions for obtaining the branch address. In order to avoid this situation, an NOP as shown in FIG. 5(b) is automatically inserted in the pipeline control of the present invention. Further, in the pipeline control of the present invention, a stall due to data dependence as shown in FIG. 6(a) may be generated. In order to avoid generating the stall, the bypass circuit 53 is placed between the ALU 51 (refer to FIG. 2) and register file 52 inside the CPU core 5. FIG. 6(b) shows a flow when the bypass circuit 53 is placed between the ALU 51 and the register file 52. The pipeline control of the present invention carries out operations in two cycles.
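The NOP insertion for branch delay can be sketched as a simple pass over an instruction sequence. This is an illustration under assumptions, not the mechanism of the embodiment, where the insertion is performed automatically by the pipeline control: the branch mnemonics, the textual instruction format, the function name, and the default of one NOP per branch are all hypothetical, since the description does not specify them.

```python
# Hypothetical branch mnemonics; the source does not define an instruction set.
BRANCH_MNEMONICS = {"BR", "BEQ", "BNE"}


def insert_branch_delay_nops(program, slots=1):
    """Return a copy of 'program' with NOPs placed after each branch.

    'program' is a list of instruction strings such as "BEQ label".
    'slots' is the number of NOPs per branch (assumed to be one here).
    """
    result = []
    for instruction in program:
        result.append(instruction)
        fields = instruction.split()
        if fields and fields[0] in BRANCH_MNEMONICS:
            # Fill the delay while the branch address is computed in E.
            result.extend(["NOP"] * slots)
    return result
```

The NOP occupies the slot that would otherwise hold an instruction fetched before the branch address is known, so the pipe keeps its regular stage timing instead of stalling.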

With reference to FIGS. 7 and 8, the above-mentioned pipeline control is concretely explained. As shown in FIG. 7, after PCF (program counter fetch cycle, 54 in FIG. 8) is carried out in the first stage (F-D), PCE (program counter execution, 55 in FIG. 8) is carried out in the second stage (E-M). At the same time that the second stage is started, the next arithmetic processing is newly started, and IRF (instruction register fetch cycle, 57 in FIG. 8) is carried out in the first stage (F-D) of the newly started processing. That is, PCE and IRF are processed in parallel. Then, when the third stage (X-W) is started and PRI (program register interrupt, 56 in FIG. 8) is carried out in the earliest arithmetic processing, the second stage (E-M) is started and IRE (instruction register execution, 58 in FIG. 8) is carried out in the later arithmetic processing. That is, the F-D processing alone is always repeatedly executed in a section for carrying out the first stage in the CPU core 5, and the same also goes for sections for carrying out the second and third stages. After the second stage (E-M) in the later arithmetic processing is ended, the third stage (X-W) is started and IRW (instruction register write back, 59 in FIG. 8) is carried out.

As described above, the integrated circuit 1 of the present invention has a configuration wherein the FPGA 2 performs functions as the CPU core 5 and the peripheral circuits thereof (interrupt module 6, timer module 7 and system bus 8) based on the logic data stored in the PROM 3, so that providing the FPGA 2 eliminates the need for mounting chips of the CPU core and peripheral circuits thereon, which allows the number of components implemented on the integrated circuit 1 to be decreased. This simplifies the configuration of the integrated circuit 1, and enhances its reliability. Also, in the FPGA 2, the system bus 8 is provided so as to connect to the CPU core 5, so that the user can retrofit a circuit having desired functions to the system bus 8. This makes it possible to extend and change the functions of the CPU core 5, and allows the user to easily configure the system LSI equipped with the CPU core having desired functions.

Moreover, the integrated circuit 1 of the present invention has a configuration wherein the arithmetic processing in the CPU core 5 is carried out in the pipeline control of the three-stage construction in which the fetch cycle and memory cycle are not executed simultaneously in the parallel operation of the multiple arithmetic processing, so that no situation arises in which the fetch cycle and memory cycle compete with each other for the same memory. Therefore, it is unnecessary to provide a cache memory, and possible to effectively use the FPGA 2.

Having described preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention as defined in the appended claims. For example, although the peripheral circuits of the CPU core 5 are the interrupt module 6 and timer module 7 in the above-mentioned embodiment, the peripheral circuits are not limited to these modules, and other circuits can also be applied.

As mentioned above, according to the present invention, the functions of the CPU core and peripheral circuits can arbitrarily be changed by changing settings of the logic data, so that the system LSI including the CPU core equipped with the desired functions can easily be produced. Therefore, this constitution makes it possible to easily produce a CPU core having the same functions as a CPU core which has gone out of production, and to effectively use the user's own data. The CPU core, which is provided as the logic data, can be produced in small quantity, so that the production costs can be reduced in comparison with the case of the conventional CPU core, which has to be produced in large quantity. Furthermore, the FPGA performs the functions of the peripheral circuits, which can decrease the number of chips to be implemented on the integrated circuit, thereby simplifying the structure and enhancing the reliability.

Moreover, a user can readily extend and change functions of the CPU core by retrofitting a desired circuit to the system bus. Therefore, the user can easily operate the configuration of the CPU core in order to make the CPU core perform desired functions. Further, processing which has conventionally been carried out only by using a number of commands in multi-programming control of software can be speedily performed in the CPU core by extending and changing functions of the CPU core.

Moreover, even in the case of performing parallel operations of multiple arithmetic processing, the configuration of the present invention does not carry out the fetch cycle and memory cycle simultaneously, thereby preventing the situation in which these cycles compete with each other for the same memory. Therefore, the parallel operation can be performed without a cache memory, which makes it possible to reduce the cost, and use the FPGA efficiently.

Furthermore, this configuration can assist in designing a system LSI by having a computer read out the data in the recording medium.

What is claimed is:
 1. An integrated circuit, comprising: a field programmable gate array and a memory device; and logic data for effecting functions of a CPU core and peripheral circuits connected thereto being stored into said memory device such that said field programmable gate array performs functions as said CPU core and peripheral circuits based on contents stored in said memory device, wherein arithmetic processing executed by said CPU core has a structure that a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein said steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed parallel in the three-stage pipeline construction.
 2. A computer-readable recording medium in which data to be written in a memory device of an integrated circuit composed of a field programmable gate array and a memory device is stored, said data including: logic data for configuring the field programmable gate array to effect functions of a CPU core; and logic data for configuring the field programmable gate array to effect functions of peripheral circuits connected to the CPU core effected by the field programmable gate array, wherein said logic data for configuring the field programmable gate array to effect functions of a CPU core includes effecting a configuration in which arithmetic processing executed by the CPU core has a structure that a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein these steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed parallel in the three-stage pipeline construction.
 3. An integrated circuit comprising: a field programmable gate array and a memory device; logic data for effecting functions of a CPU core and peripheral circuits connected thereto being stored into said memory device such that said field programmable gate array performs functions as said CPU core and peripheral circuits based on contents stored in said memory device; and said peripheral circuits effected by said field programmable gate array being provided with a system bus effected by said field programmable gate array, and configured such that a user connects an arbitrary circuit to said system bus, wherein arithmetic processing executed by said CPU core has a structure that a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein said steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed parallel in the three-stage pipeline construction.
 4. A computer-readable recording medium in which data to be written in a memory device of an integrated circuit composed of a field programmable gate array and a memory device is stored, said data including: logic data for configuring the field programmable gate array to effect functions of a CPU core; and logic data for configuring the field programmable gate array to effect functions of peripheral circuits connected to the CPU core effected by the field programmable gate array, wherein said peripheral circuits effected by said field programmable gate array include a system bus effected by said field programmable gate array to which a user connects an arbitrary circuit, and wherein said logic data includes a configuration in which arithmetic processing executed by the CPU core effected by said field programmable gate array has a structure that a dummy step is incorporated into steps of fetch, decode, execution, memory and write-back, wherein these steps are divided into three stages: a first stage for carrying out in order of fetch and decode, a second stage for carrying out in order of execution and memory, and a third stage for carrying out in order of dummy and write-back, wherein the processing is carried out in order of first, second and third stages, and wherein, every time one stage is completed, another arithmetic processing is started, and simultaneous operations of different arithmetic processing are executed parallel in the three-stage pipeline construction.